Chinese-Vietnamese pseudo-parallel corpus generation based on monolingual language model
JIA Chengxun, LAI Hua, YU Zhengtao, WEN Yonghua, YU Zhiqiang
Journal of Computer Applications    2021, 41 (6): 1652-1658.   DOI: 10.11772/j.issn.1001-9081.2020071017
Neural machine translation performs well on resource-rich languages but poorly on low-resource language pairs such as Chinese-Vietnamese because of data scarcity. One of the most effective ways to alleviate this problem is to generate pseudo-parallel data from existing resources. Because monolingual data is readily available, a method built on back-translation was proposed. First, a language model trained on a large amount of monolingual data was fused with the neural machine translation model. Then, language features were integrated through the language model during back-translation to generate more standardized, higher-quality pseudo-parallel data. Finally, the generated corpus was added to the original small-scale corpus to train the final translation model. Experimental results on Chinese-Vietnamese translation tasks show that, compared with ordinary back-translation, training with the pseudo-parallel data generated with language model fusion improves the BiLingual Evaluation Understudy (BLEU) score of Chinese-Vietnamese neural machine translation by 1.41 percentage points.
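The abstract does not say how the language model is fused with the translation model. As a minimal sketch, assuming a shallow-fusion style rescoring (one common option, not necessarily the authors' method), each back-translation candidate can be ranked by its translation-model log-probability plus a weighted language-model log-probability. The names `fused_score`, `select_back_translation`, `build_training_corpus`, and the weight `lm_weight` are hypothetical, and the scores in the usage note are toy values.

```python
from typing import List, Sequence, Tuple

# (candidate text, log P_TM(candidate | source), log P_LM(candidate))
Candidate = Tuple[str, float, float]
SentencePair = Tuple[str, str]


def fused_score(tm_logprob: float, lm_logprob: float, lm_weight: float) -> float:
    """Assumed shallow-fusion score: TM log-prob plus weighted LM log-prob."""
    return tm_logprob + lm_weight * lm_logprob


def select_back_translation(candidates: List[Candidate], lm_weight: float = 0.3) -> str:
    """Return the candidate text with the highest fused score."""
    if not candidates:
        raise ValueError("need at least one candidate")
    best = max(candidates, key=lambda c: fused_score(c[1], c[2], lm_weight))
    return best[0]


def build_training_corpus(
    original: Sequence[SentencePair], pseudo: Sequence[SentencePair]
) -> List[SentencePair]:
    """Append the generated pseudo-parallel pairs to the original small corpus."""
    return list(original) + list(pseudo)
```

With toy candidates `[("disfluent", -2.0, -9.0), ("fluent", -2.3, -4.0)]`, the language-model term shifts the choice toward the more fluent candidate when `lm_weight` is 0.3, whereas a weight of 0 reduces to ordinary back-translation scoring.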
Recognition of Chinese news event correlation based on grey relational analysis
LIU Panpan, HONG Xudong, GUO Jianyi, YU Zhengtao, WEN Yonghua, CHEN Wei
Journal of Computer Applications    2016, 36 (2): 408-413.   DOI: 10.11772/j.issn.1001-9081.2016.02.0408
To address the low accuracy of identifying correlated Chinese events, a correlation recognition algorithm for Chinese news events based on Grey Relational Analysis (GRA), a multi-factor analysis method, was proposed. First, by analyzing the characteristics of Chinese news events, three factors that affect event correlation were identified: co-occurrence of triggers, nouns shared between events, and similarity of the event sentences. Second, the three factors were quantified and their influence weights were calculated. Finally, GRA was used to combine the three factors, and a GRA model between events was established to recognize event correlation. The experimental results show that all three factors are effective for event correlation recognition and that, compared with methods using only a single influence factor, the proposed algorithm improves recognition accuracy.
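The combination step can be illustrated with the standard grey relational coefficient, (Δmin + ρΔmax) / (Δ + ρΔmax), followed by a weighted sum. This is a minimal sketch under assumptions the abstract does not confirm: each factor is already scaled to [0, 1], the reference sequence is the ideal value 1.0 for every factor, the distinguishing coefficient ρ is 0.5, and the weights are illustrative rather than the ones computed in the paper. The function name `grey_relational_grades` is hypothetical.

```python
from typing import List, Sequence


def grey_relational_grades(
    rows: Sequence[Sequence[float]],
    weights: Sequence[float],
    rho: float = 0.5,
) -> List[float]:
    """Weighted grey relational grade of each row against an ideal of all 1.0.

    Each row holds one event pair's factor values, e.g. (trigger
    co-occurrence, shared nouns, sentence similarity), scaled to [0, 1].
    """
    if not rows:
        return []
    # Absolute deviation of every factor from the assumed ideal value 1.0.
    deltas = [[abs(1.0 - v) for v in row] for row in rows]
    flat = [d for row in deltas for d in row]
    d_min, d_max = min(flat), max(flat)
    if d_max == 0.0:
        # Every pair matches the ideal exactly, so every coefficient is 1.
        return [sum(weights)] * len(rows)
    grades = []
    for row in deltas:
        # Grey relational coefficient per factor, then weighted sum.
        coeffs = [(d_min + rho * d_max) / (d + rho * d_max) for d in row]
        grades.append(sum(w * c for w, c in zip(weights, coeffs)))
    return grades
```

For example, `grey_relational_grades([[0.9, 0.8, 0.85], [0.1, 0.2, 0.3]], [0.4, 0.3, 0.3])` gives the first event pair a higher grade than the second; a threshold on the grade (not specified in the abstract) would then decide whether two events are treated as correlated.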